Global Deaths Due to Air Pollution

Elizabeth Bekele, Alison Cheek

2022-05-03

Introduction

Packages Required

# Data wrangling and filtering
library(tidyverse)
library(dplyr)
# Plotting figures to showcase our findings
library(ggplot2)
# Organizing and displaying tables
library(knitr)
library(kableExtra)
# Interactive plots
library(plotly)
# Disable scientific notation in printed output
options(scipen = 999)

Deaths Data

Import the deaths-due-to-air-pollution data

# read.csv() already returns a data frame, so no extra data.frame() wrapper is needed
deaths_df <- read.csv("death-rates-from-air-pollution.csv")

We are going to rename a few of the columns and glimpse the data

colnames(deaths_df) <- c("country", "acronym", "year", "total_deaths", "indoor_deaths", "outdoor_deaths", "ozone_deaths")

glimpse(deaths_df)
## Rows: 6,468
## Columns: 7
## $ country        <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist…
## $ acronym        <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
## $ year           <int> 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1…
## $ total_deaths   <dbl> 299.4773, 291.2780, 278.9631, 278.7908, 287.1629, 288.0…
## $ indoor_deaths  <dbl> 250.3629, 242.5751, 232.0439, 231.6481, 238.8372, 239.9…
## $ outdoor_deaths <dbl> 46.44659, 46.03384, 44.24377, 44.44015, 45.59433, 45.36…
## $ ozone_deaths   <dbl> 5.616442, 5.603960, 5.611822, 5.655266, 5.718922, 5.739…

Data Variables

Variables that interest us here include:

World Population Data

Now, let’s take a look at the population data.

world_pop <- read.csv("population_total_long.csv")
glimpse(world_pop)
## Rows: 12,595
## Columns: 3
## $ Country.Name <chr> "Aruba", "Afghanistan", "Angola", "Albania", "Andorra", "…
## $ Year         <int> 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 1960, 196…
## $ Count        <int> 54211, 8996973, 5454933, 1608800, 13411, 92418, 20481779,…

To get a general idea of the deaths data frame we made, let’s make a plot to see what’s happening. This is a plot of indoor versus outdoor deaths around the world, by country.
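The code behind this plot is not shown in the report, so the following is only a minimal sketch of how an indoor-versus-outdoor scatter plot could be drawn. The data frame `toy_deaths` is a hypothetical stand-in holding four 1997 rows copied from the tables later in this report; the real `deaths_df` has 6,468 rows.

```r
library(ggplot2)

# Hypothetical toy data: four 1997 rows copied from the tables in this report
toy_deaths <- data.frame(
  country        = c("Australia", "Brazil", "Canada", "Chile"),
  indoor_deaths  = c(0.3222224, 26.4634509, 0.0877542, 12.3262645),
  outdoor_deaths = c(21.838737, 28.615177, 19.908473, 31.559124)
)

# One point per row, colored by country
p <- ggplot(toy_deaths,
            aes(x = indoor_deaths, y = outdoor_deaths, color = country)) +
  geom_point() +
  labs(x = "Indoor deaths (per 100,000)",
       y = "Outdoor deaths (per 100,000)")

print(p)
```

With every country and year plotted at once, the color legend alone would have a large number of entries, which is one reason such a plot becomes hard to read.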

A plot of every country is too cluttered to read, so we chose two countries from each continent (one high-population and one low-population) to graph.

We selected a high-population country from each continent and used the formula below as an approximate target for the population of its low-population counterpart.

Low population = high population * 0.10
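As a rough check of this rule, the sketch below computes the 10% target for each high-population country and the ratio actually achieved by its counterpart. The populations are the 1997 values from the tables in this section; pairing the countries by continent (for example United States with Canada) is an assumption made for illustration, since the report does not list the pairs explicitly.

```r
# 1997 populations copied from the tables in this section.
# The high/low pairing by continent is an assumption for illustration.
pairs_1997 <- data.frame(
  high_country = c("United States", "Brazil", "Germany",
                   "Nigeria", "Pakistan", "Australia"),
  high_pop     = c(272657000, 167209040, 82034771,
                   113457663, 131057431, 18517000),
  low_country  = c("Canada", "Chile", "Serbia",
                   "Malawi", "Sri Lanka", "New Zealand"),
  low_pop      = c(29905948, 14786220, 7596501,
                   10264906, 18470900, 3781300)
)

# Target from the formula, and the ratio the chosen country actually has
pairs_1997$target_low_pop <- pairs_1997$high_pop * 0.10
pairs_1997$actual_ratio   <- pairs_1997$low_pop / pairs_1997$high_pop

print(pairs_1997[, c("high_country", "low_country",
                     "target_low_pop", "actual_ratio")])
```

Under this pairing the ratios fall between roughly 0.09 and 0.20, so the formula works as a guideline rather than an exact cutoff.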

Country.Name     Year      Count
Australia        1997   18517000
Brazil           1997  167209040
Germany          1997   82034771
Nigeria          1997  113457663
Pakistan         1997  131057431
United States    1997  272657000

Country.Name     Year      Count
Canada           1997   29905948
Chile            1997   14786220
Sri Lanka        1997   18470900
Malawi           1997   10264906
New Zealand      1997    3781300
Serbia           1997    7596501

Combine Data Sets

First, let’s look at a table of the high- and low-population countries using the world population data set.

Country.Name     Year      Count
Australia        1997   18517000
Brazil           1997  167209040
Germany          1997   82034771
Nigeria          1997  113457663
Pakistan         1997  131057431
United States    1997  272657000

Country.Name     Year      Count
Canada           1997   29905948
Chile            1997   14786220
Sri Lanka        1997   18470900
Malawi           1997   10264906
New Zealand      1997    3781300
Serbia           1997    7596501

Next, we look at the death rates (per 100,000) for the high- and low-population countries using the deaths data frame.

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
Australia AUS 1997 22.43025 0.3222224 21.838737 0.3141838
Australia AUS 1998 21.50529 0.2839769 20.960276 0.3048918
Australia AUS 1999 20.40911 0.2590092 19.897091 0.2953354
Australia AUS 2000 19.39822 0.2398763 18.909240 0.2899216
Australia AUS 2001 18.58572 0.2234341 18.118700 0.2836469
Australia AUS 2002 18.11849 0.2105980 17.662269 0.2859938
Australia AUS 2003 17.23830 0.1937083 16.802536 0.2816949
Australia AUS 2004 16.34770 0.1760229 15.932077 0.2785466
Australia AUS 2005 15.41337 0.1599279 15.016089 0.2757150
Australia AUS 2006 14.92239 0.1496469 14.530223 0.2819060
Australia AUS 2007 14.92140 0.1449723 14.514884 0.3042005
Australia AUS 2008 14.64683 0.1383225 14.228709 0.3254648
Australia AUS 2009 14.11563 0.1259313 13.694572 0.3431982
Australia AUS 2010 13.57171 0.1174834 13.140380 0.3647233
Australia AUS 2011 13.72763 0.1119247 13.276676 0.3956796
Australia AUS 2012 12.65973 0.1018626 12.196401 0.4192914
Australia AUS 2013 11.87449 0.0973836 11.384154 0.4530427
Australia AUS 2014 11.47268 0.0931036 10.939491 0.5037056
Australia AUS 2015 11.27679 0.0886376 10.702072 0.5544068
Australia AUS 2016 10.58644 0.0844017 9.974549 0.5955779
Australia AUS 2017 10.79595 0.0833628 10.128111 0.6592419
Brazil BRA 1997 57.64589 26.4634509 28.615177 3.3000853

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths
Canada CAN 1997 21.92768 0.0877542 19.908473 2.1959403
Canada CAN 1998 21.65538 0.0824492 19.634839 2.2056813
Canada CAN 1999 21.17703 0.0751278 19.179045 2.1894261
Canada CAN 2000 20.26486 0.0681836 18.326999 2.1277328
Canada CAN 2001 19.82451 0.0641108 17.938427 2.0764642
Canada CAN 2002 19.52428 0.0604824 17.669133 2.0476034
Canada CAN 2003 19.17033 0.0564743 17.338627 2.0268644
Canada CAN 2004 18.40919 0.0513588 16.629516 1.9730254
Canada CAN 2005 17.79268 0.0481667 16.030102 1.9547116
Canada CAN 2006 17.14391 0.0447622 15.445519 1.8887355
Canada CAN 2007 16.93196 0.0435468 15.229981 1.8952587
Canada CAN 2008 16.51814 0.0407468 14.829238 1.8832421
Canada CAN 2009 15.76760 0.0380831 14.118647 1.8389200
Canada CAN 2010 14.88338 0.0340653 13.281852 1.7864304
Canada CAN 2011 14.59934 0.0319160 13.030477 1.7569979
Canada CAN 2012 13.82968 0.0307105 12.243601 1.7647269
Canada CAN 2013 12.97501 0.0288027 11.410021 1.7339970
Canada CAN 2014 12.61872 0.0276959 11.032571 1.7469907
Canada CAN 2015 12.21793 0.0270578 10.609097 1.7638948
Canada CAN 2016 11.00267 0.0251286 9.397502 1.7408337
Canada CAN 2017 10.71662 0.0247705 9.110733 1.7397181
Chile CHL 1997 44.35418 12.3262645 31.559124 0.6260233

Lastly, we join the population counts to the deaths data for each country and year.

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths Count
Australia AUS 1997 22.43025 0.3222224 21.838737 0.3141838 18517000
Australia AUS 1998 21.50529 0.2839769 20.960276 0.3048918 18711000
Australia AUS 1999 20.40911 0.2590092 19.897091 0.2953354 18926000
Australia AUS 2000 19.39822 0.2398763 18.909240 0.2899216 19153000
Australia AUS 2001 18.58572 0.2234341 18.118700 0.2836469 19413000
Australia AUS 2002 18.11849 0.2105980 17.662269 0.2859938 19651400
Australia AUS 2003 17.23830 0.1937083 16.802536 0.2816949 19895400
Australia AUS 2004 16.34770 0.1760229 15.932077 0.2785466 20127400
Australia AUS 2005 15.41337 0.1599279 15.016089 0.2757150 20394800
Australia AUS 2006 14.92239 0.1496469 14.530223 0.2819060 20697900
Australia AUS 2007 14.92140 0.1449723 14.514884 0.3042005 20827600
Australia AUS 2008 14.64683 0.1383225 14.228709 0.3254648 21249200
Australia AUS 2009 14.11563 0.1259313 13.694572 0.3431982 21691700
Australia AUS 2010 13.57171 0.1174834 13.140380 0.3647233 22031750
Australia AUS 2011 13.72763 0.1119247 13.276676 0.3956796 22340024
Australia AUS 2012 12.65973 0.1018626 12.196401 0.4192914 22733465
Australia AUS 2013 11.87449 0.0973836 11.384154 0.4530427 23128129
Australia AUS 2014 11.47268 0.0931036 10.939491 0.5037056 23475686
Australia AUS 2015 11.27679 0.0886376 10.702072 0.5544068 23815995
Australia AUS 2016 10.58644 0.0844017 9.974549 0.5955779 24190907
Australia AUS 2017 10.79595 0.0833628 10.128111 0.6592419 24601860
Brazil BRA 1997 57.64589 26.4634509 28.615177 3.3000853 167209040

country acronym year total_deaths indoor_deaths outdoor_deaths ozone_deaths Count
Canada CAN 1997 21.92768 0.0877542 19.908473 2.1959403 29905948
Canada CAN 1998 21.65538 0.0824492 19.634839 2.2056813 30155173
Canada CAN 1999 21.17703 0.0751278 19.179045 2.1894261 30401286
Canada CAN 2000 20.26486 0.0681836 18.326999 2.1277328 30685730
Canada CAN 2001 19.82451 0.0641108 17.938427 2.0764642 31020902
Canada CAN 2002 19.52428 0.0604824 17.669133 2.0476034 31360079
Canada CAN 2003 19.17033 0.0564743 17.338627 2.0268644 31644028
Canada CAN 2004 18.40919 0.0513588 16.629516 1.9730254 31940655
Canada CAN 2005 17.79268 0.0481667 16.030102 1.9547116 32243753
Canada CAN 2006 17.14391 0.0447622 15.445519 1.8887355 32571174
Canada CAN 2007 16.93196 0.0435468 15.229981 1.8952587 32889025
Canada CAN 2008 16.51814 0.0407468 14.829238 1.8832421 33247118
Canada CAN 2009 15.76760 0.0380831 14.118647 1.8389200 33628895
Canada CAN 2010 14.88338 0.0340653 13.281852 1.7864304 34004889
Canada CAN 2011 14.59934 0.0319160 13.030477 1.7569979 34339328
Canada CAN 2012 13.82968 0.0307105 12.243601 1.7647269 34714222
Canada CAN 2013 12.97501 0.0288027 11.410021 1.7339970 35082954
Canada CAN 2014 12.61872 0.0276959 11.032571 1.7469907 35437435
Canada CAN 2015 12.21793 0.0270578 10.609097 1.7638948 35702908
Canada CAN 2016 11.00267 0.0251286 9.397502 1.7408337 36109487
Canada CAN 2017 10.71662 0.0247705 9.110733 1.7397181 36540268
Chile CHL 1997 44.35418 12.3262645 31.559124 0.6260233 14786220
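
The join code is not shown in the report. One way to produce a table like the ones above is `dplyr::inner_join()` keyed on country and year; the sketch below assumes that approach and uses two small hypothetical data frames (`deaths_toy`, `pop_toy`) built from values printed in this report.

```r
library(dplyr)

# Hypothetical toy versions of deaths_df and world_pop
deaths_toy <- data.frame(
  country      = c("Australia", "Australia", "Canada"),
  year         = c(1997L, 1998L, 1997L),
  total_deaths = c(22.43025, 21.50529, 21.92768)
)
pop_toy <- data.frame(
  Country.Name = c("Australia", "Australia", "Canada", "Aruba"),
  Year         = c(1997L, 1998L, 1997L, 1960L),
  Count        = c(18517000L, 18711000L, 29905948L, 54211L)
)

# Match on both keys; the two tables use different column names,
# so the mapping is spelled out in `by`
joined <- inner_join(
  deaths_toy, pop_toy,
  by = c("country" = "Country.Name", "year" = "Year")
)

print(joined)
```

An inner join keeps only country-year pairs present in both tables, so the Aruba 1960 row is dropped here. Country names must be spelled identically in both sources for the match to succeed.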

Death Count

Which country has the highest death count?

Let’s make a table of the high- and low-population countries and their respective average death rates (per 100,000) due to pollution.

country          hp_average_death
Australia                17.76815
Brazil                   48.42928
Germany                  28.10988
Nigeria                 112.30157
Pakistan                144.33463
United States            26.35827

country          lp_average_death
Canada                   18.18542
Chile                    36.51321
Malawi                  147.77167
New Zealand              15.92536
Serbia                   80.66558
Sri Lanka                69.60383

Here’s a graph to visualize the previous table.

So we’ve looked at the deaths due to pollution, but what percentage of the population was affected?

Country.Name     average_population
Australia                  21217772
Brazil                    189132292
Germany                    81914540
Nigeria                   148549958
Pakistan                  168525322
United States             300447600

Country.Name     average_population
Canada                     33029774
Chile                      16555805
Malawi                     13605376
New Zealand                 4214995
Serbia                      7345882
Sri Lanka                  19824652
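
To turn the figures above into a percentage, a rate per 100,000 can be divided by 1,000. The sketch below assumes `total_deaths` is a rate per 100,000 people, as the axis labels in the plotting code at the end of this report state, and that the average rate and average population cover comparable years. The column names `pct_of_population` and `approx_deaths` are hypothetical.

```r
# Average rates and populations copied from the tables in this section
rates <- data.frame(
  country            = c("Australia", "Brazil", "Germany"),
  avg_rate_per_100k  = c(17.76815, 48.42928, 28.10988),
  average_population = c(21217772, 189132292, 81914540)
)

# Rate per 100,000 expressed as a percentage of the population
rates$pct_of_population <- rates$avg_rate_per_100k / 100000 * 100

# Rough implied number of deaths per year (an approximation only)
rates$approx_deaths <- rates$avg_rate_per_100k / 100000 *
  rates$average_population

print(rates)
```

If the published rates are age-standardized, the implied death counts are only indicative, because a standardized rate does not map exactly onto a country's actual population.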

Pollution Types

Which type of pollution has the greatest number of deaths?

country           avg_indoor  avg_outdoor  avg_ozone
Pakistan          87.7427944     50.52063  10.440656
Nigeria           75.8755074     35.21678   2.117076
Brazil            19.4258385     26.84194   2.740342
Germany            0.7170881     25.47078   2.343892
Australia          0.2485867     17.20789   0.360452
United States      0.1656402     22.79947   3.915093

country           avg_indoor  avg_outdoor  avg_ozone
Canada             0.0651156     16.38423  1.9697041
Chile              8.6932699     27.17442  0.8504919
Malawi           132.1891749     13.81151  3.3870514
New Zealand        0.2908622     15.56872  0.0727512
Serbia            35.8762796     42.71254  2.9395671
Sri Lanka         44.5428441     24.77233  0.4304406
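
To answer the question row by row, one option is to pick, for each country, the column holding the largest average. The sketch below does this with base R's `max.col()` on three rows copied from the tables above; the column name `largest_type` is hypothetical.

```r
# Three rows copied from the tables above
avg_types <- data.frame(
  country     = c("Pakistan", "Germany", "Malawi"),
  avg_indoor  = c(87.7427944, 0.7170881, 132.1891749),
  avg_outdoor = c(50.52063, 25.47078, 13.81151),
  avg_ozone   = c(10.440656, 2.343892, 3.3870514)
)

type_cols <- c("avg_indoor", "avg_outdoor", "avg_ozone")

# max.col() returns, per row, the index of the column with the largest value
largest_idx <- max.col(as.matrix(avg_types[type_cols]), ties.method = "first")
avg_types$largest_type <- type_cols[largest_idx]

print(avg_types[, c("country", "largest_type")])
```

In these three rows, indoor pollution is the largest category for Pakistan and Malawi, while outdoor pollution is the largest for Germany.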

Pollution Over Time

Let’s look at the previous two decades and compare the death rates. Has there been a change?

This is the first decade, 1996-2006:

country          High_Deaths_96  High_Deaths_01  High_Deaths_06
Australia              23.04465        18.58572        14.92239
Brazil                 60.67757        49.46436        41.46829
Germany                34.72325        28.38756        23.83654
Nigeria               136.08978       123.05129       102.26653
Pakistan              155.42988       151.25352       146.09296
United States          29.99271        28.93114        25.93369

country          Low_Deaths_96  Low_Deaths_01  Low_Deaths_06
Australia             22.18101       19.82451       14.92239
Brazil                46.36829       37.43188       41.46829
Germany              183.14179      165.41702       23.83654
Nigeria               93.44700       83.18333      102.26653
Pakistan              85.28997       72.16239      146.09296
United States        100.66078       95.27073       25.93369

This is the second decade, 2007-2017:

country          High_Deaths_07  High_Deaths_12  High_Deaths_17
Australia              14.92140        12.65973        10.79595
Brazil                 40.42460        35.39069        30.32108
Germany                23.45850        20.91536        19.82826
Nigeria                98.90306        84.22324        81.22147
Pakistan              143.81724       133.93887       123.21548
United States          25.11756        21.98194        18.82515

country          Low_Deaths_07  Low_Deaths_12  Low_Deaths_17
Canada                16.93196       13.82968       10.71662
Chile                 30.53130       27.31475       24.29921
Malawi               132.12253      116.27470      104.93508
Serbia                76.65752       72.77354       62.57853
Sri Lanka             66.05987       59.22433       38.46264
Tonga                 87.81178       79.49336       70.72940
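
The code that built these decade tables is not shown. A minimal sketch of one possible approach is to filter to the snapshot years and spread them into columns with `tidyr::pivot_wider()`. The data frame `toy` and the column names `Deaths_2007`, `Deaths_2012`, `Deaths_2017`, and `change_07_17` are hypothetical; the values are copied from the tables in this report.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: snapshot-year values for two countries
toy <- data.frame(
  country      = rep(c("Australia", "Canada"), each = 3),
  year         = rep(c(2007L, 2012L, 2017L), times = 2),
  total_deaths = c(14.92140, 12.65973, 10.79595,
                   16.93196, 13.82968, 10.71662)
)

snapshot <- toy %>%
  filter(year %in% c(2007, 2012, 2017)) %>%
  # One column per snapshot year, one row per country
  pivot_wider(names_from = year, values_from = total_deaths,
              names_prefix = "Deaths_") %>%
  # Change over the decade; negative means the rate fell
  mutate(change_07_17 = Deaths_2017 - Deaths_2007)

print(snapshot)
```

Building the country column and the year columns from the same filtered data in one pipeline keeps the row labels and values aligned, which is harder to guarantee when columns are assembled separately.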

Let’s graph the previous tables!

The first decade.

This shows the second decade.

By comparing each pollutant type, we can determine which year and country had the highest number of deaths.
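
Alongside the plots, the country and year with the highest value for a given pollutant can be read off directly with `dplyr::slice_max()`. This is a sketch on a hypothetical five-row subset (`toy`) copied from the tables in this report, not the report's own code.

```r
library(dplyr)

# Hypothetical subset of deaths_df, values copied from this report
toy <- data.frame(
  country        = c("Australia", "Australia", "Brazil", "Canada", "Chile"),
  year           = c(1997L, 2017L, 1997L, 1997L, 1997L),
  indoor_deaths  = c(0.3222224, 0.0833628, 26.4634509, 0.0877542, 12.3262645),
  outdoor_deaths = c(21.838737, 10.128111, 28.615177, 19.908473, 31.559124),
  ozone_deaths   = c(0.3141838, 0.6592419, 3.3000853, 2.1959403, 0.6260233)
)

# Row with the highest rate for each pollutant type
worst_indoor  <- toy %>% slice_max(indoor_deaths, n = 1)
worst_outdoor <- toy %>% slice_max(outdoor_deaths, n = 1)
worst_ozone   <- toy %>% slice_max(ozone_deaths, n = 1)

print(worst_indoor[, c("country", "year", "indoor_deaths")])
print(worst_outdoor[, c("country", "year", "outdoor_deaths")])
print(worst_ozone[, c("country", "year", "ozone_deaths")])
```

On the full data the same calls would be run on the filtered high- or low-population subsets rather than on this toy sample.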

Indoor Deaths

Outdoor Deaths

Ozone Deaths

Which is worse?

Outdoor or indoor pollution?

Let’s reintroduce a graph we looked at earlier. This time, we will combine the pollutant types.
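
The code for the combined graph is not shown. One common way to put several pollutant columns on a single chart is to reshape them to long form with `tidyr::pivot_longer()` and map the pollutant type to the fill color. The sketch below assumes that approach; `long_types`, `pollution_type`, and `avg_deaths` are hypothetical names, and the values come from the Pollution Types tables.

```r
library(tidyr)
library(ggplot2)

# Three rows copied from the Pollution Types tables
avg_types <- data.frame(
  country     = c("Pakistan", "Germany", "Malawi"),
  avg_indoor  = c(87.7427944, 0.7170881, 132.1891749),
  avg_outdoor = c(50.52063, 25.47078, 13.81151),
  avg_ozone   = c(10.440656, 2.343892, 3.3870514)
)

# Long form: one row per country and pollutant type
long_types <- pivot_longer(
  avg_types,
  cols      = c(avg_indoor, avg_outdoor, avg_ozone),
  names_to  = "pollution_type",
  values_to = "avg_deaths"
)

# Side-by-side bars, one bar per pollutant type within each country
p <- ggplot(long_types,
            aes(x = country, y = avg_deaths, fill = pollution_type)) +
  geom_col(position = "dodge") +
  ylab("Average deaths (per 100,000)") +
  coord_flip()

print(p)
```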

We cannot conclude which is worse.

The code below reproduces the average-death tables and bar charts from the Death Count section.

# Mean total deaths (per 100,000) of high-population countries,
# averaged over every year in deaths_df (no year filter is applied)
deaths_highpop_countries <- deaths_df %>%
  filter(country %in% c('United States', 'Brazil', 'Nigeria', 'Germany', 'Pakistan', 'Australia')) %>%
  group_by(country) %>%
  summarize(average_death_high = mean(total_deaths))
# Mean total deaths (per 100,000) of low-population countries,
# averaged over every year in deaths_df (no year filter is applied)
deaths_lowpop_countries <- deaths_df %>%
  filter(country %in% c('Canada', 'Chile', 'Malawi', 'Serbia', 'Sri Lanka', 'New Zealand')) %>%
  group_by(country) %>%
  summarize(average_death_low = mean(total_deaths))
# Average death rates of both high- and low-population countries, side by side
kable(list(deaths_highpop_countries, deaths_lowpop_countries))
country          average_death_high
Australia                  17.76815
Brazil                     48.42928
Germany                    28.10988
Nigeria                   112.30157
Pakistan                  144.33463
United States              26.35827

country          average_death_low
Canada                    18.18542
Chile                     36.51321
Malawi                   147.77167
New Zealand               15.92536
Serbia                    80.66558
Sri Lanka                 69.60383

ggplot(deaths_highpop_countries) +
  geom_col(mapping = aes(x = country, y = average_death_high)) +
  xlab("Country") +
  ylab("Average deaths (per 100,000)") +
  ggtitle("Average total deaths in high-population countries") +
  coord_flip()

ggplot(deaths_lowpop_countries) +
  geom_col(mapping = aes(x = country, y = average_death_low)) +
  xlab("Country") +
  ylab("Average deaths (per 100,000)") +
  ggtitle("Average total deaths in low-population countries") +
  coord_flip()

Summary

Sources